Encoding Lexicalized Tree Adjoining Grammars with a Nonmonotonic Inheritance Hierarchy
Abstract
This paper shows how DATR, a widely used formal language for lexical knowledge representation, can be used to define an LTAG lexicon as an inheritance hierarchy with internal lexical rules. A bottom-up featural encoding is used for LTAG trees and this allows lexical rules to be implemented as covariation constraints within feature structures. Such an approach eliminates the considerable redundancy otherwise associated with an LTAG lexicon.

1 Introduction

The Tree Adjoining Grammar (TAG) formalism was first introduced two decades ago (Joshi et al., 1975), and since then there has been a steady stream of theoretical work using the formalism. But it is only more recently that grammars of non-trivial size have been developed: Abeillé, Bishop, Cote & Schabes (1990) describe a feature-based Lexicalized Tree Adjoining Grammar (LTAG) for English which subsequently became the basis for the grammar used in the XTAG system, a wide-coverage LTAG parser (Doran et al., 1994b; Doran et al., 1994a; XTAG Research Group, 1995). The advent of such large grammars gives rise to questions of efficient representation, and the fully lexicalized character of the LTAG formalism suggests that recent research into lexical representation might be a place to look for answers (see for example Briscoe et al. (1993); Daelemans & Gazdar (1992)). In this paper we explore this suggestion by showing how the lexical knowledge representation language (LKRL) DATR (Evans & Gazdar, 1989a; Evans & Gazdar, 1989b) can be used to formulate a compact, hierarchical encoding of an LTAG.

The issue of efficient representation for LTAG¹ is discussed by Vijay-Shanker & Schabes (1992), who draw attention to the considerable redundancy inherent in LTAG lexicons that are expressed in a flat manner with no sharing of structure or properties across the elementary trees. For example, XTAG currently includes over 100,000 lexemes, each of which is associated with a family of trees (typically around 20) drawn from a set of over 500 elementary trees. Many of these trees have structure in common, many of the lexemes have the same tree families, and many of the trees within families are systematically related in ways which other formalisms capture using transformations or metarules. However, the LTAG formalism itself does not provide any direct support for capturing such regularities. …

¹ As with all fully lexicalized grammar formalisms, there is really no conceptual distinction to be drawn in LTAG between the lexicon and the grammar: the grammatical rules are just lexical properties.
Similar resources
Resources for Lexicalized Tree Adjoining Grammars and XML Encoding: TagML
This work addresses both practical and theoretical purposes for the encoding and the exploitation of linguistic resources for feature-based Lexicalized Tree Adjoining Grammars (LTAG). The main goals of these specifications are the following: 1. Define a recommendation by way of an XML (Bray et al., 1998) DTD or schema (Fallside, 2000) for encoding LTAG resources in order to exchange gram...
Extraction of Tree Adjoining Grammars from a Treebank for Korean
We present the implementation of a system which extracts not only lexicalized grammars but also feature-based lexicalized grammars from the Korean Sejong Treebank. We report on some practical experiments where we extract TAG grammars and tree schemata. Above all, full-scale syntactic tags and well-formed morphological analysis in the Sejong Treebank allow us to extract syntactic features. In addition, ...
Encoding Frequency Information in Lexicalized Grammars
We address the issue of how to associate frequency information with lexicalized grammar formalisms, using Lexicalized Tree Adjoining Grammar as a representative framework. We consider systematically a number of alternative probabilistic frameworks, evaluating their adequacy from both a theoretical and empirical perspective using data from existing large treebanks. We also propose three orthogon...
Structure Sharing in Lexicalized Tree-Adjoining Grammars
We present a scheme for efficiently representing a lexicalized tree-adjoining grammar (LTAG). The proposed representational scheme allows for structure-sharing between lexical entries and the trees associated with the lexical items. A compact organization is achieved by organizing the lexicon in a hierarchical fashion and using inheritance as well as by using lexical and syntactic rules. While ...
Publication date: 1995